Fast Relational Data Mining Query Optimization for Improving the Efficiency of Relational Data Mining Systems

Author

  • Jan Struyf
Abstract

Data mining is the process of building predictive or descriptive models from a large data set, often stored in a relational database. Propositional data mining systems require that the data be converted into a single table. Relational data mining systems, on the other hand, can build models directly from the relational database. While building a model, a relational data mining system executes a huge number of queries on the database, which consumes a large amount of CPU time. In our work, we propose a number of query optimization techniques that speed up query execution. Relational data mining systems generate queries and access the data in a structured way, and our optimizations exploit this structure as much as possible.

Keywords: Relational Data Mining, Machine Learning, Efficient Algorithms, Query Optimization
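
The contrast the abstract draws between propositional and relational systems can be illustrated with a minimal sketch. It is not taken from the paper: the two-table schema (`molecule`, `atom`), the toy rows, and the function names are all hypothetical, and it uses only Python's standard `sqlite3` module. The propositional route first flattens the data into one row per example; the relational route evaluates a candidate pattern directly against the database, which is the kind of query a relational data mining system would issue many times.

```python
import sqlite3


def build_db():
    # Hypothetical toy schema and data, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE molecule (id INTEGER PRIMARY KEY, active INTEGER);
        CREATE TABLE atom (mol_id INTEGER, element TEXT);
        INSERT INTO molecule VALUES (1, 1), (2, 0), (3, 1);
        INSERT INTO atom VALUES (1, 'C'), (1, 'O'), (2, 'C'), (3, 'O'), (3, 'N');
    """)
    return conn


def propositionalize(conn):
    # Propositional route: aggregate the one-to-many relation into a
    # single table with exactly one row per molecule.
    return conn.execute("""
        SELECT m.id, m.active,
               COUNT(a.element) AS n_atoms,
               COALESCE(MAX(a.element = 'O'), 0) AS has_oxygen
        FROM molecule m LEFT JOIN atom a ON a.mol_id = m.id
        GROUP BY m.id, m.active
        ORDER BY m.id
    """).fetchall()


def coverage(conn, element):
    # Relational route: count the molecules covered by the pattern
    # "contains at least one atom of the given element", evaluated
    # directly on the original tables.
    return conn.execute("""
        SELECT COUNT(*) FROM molecule m
        WHERE EXISTS (SELECT 1 FROM atom a
                      WHERE a.mol_id = m.id AND a.element = ?)
    """, (element,)).fetchone()[0]


if __name__ == "__main__":
    conn = build_db()
    print(propositionalize(conn))
    print(coverage(conn, "O"))
```

In this sketch every candidate pattern costs one more `coverage` query against the database, which gives a sense of why the number of queries, and hence CPU time, grows quickly during model building.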

Similar resources

Relational Databases Query Optimization using Hybrid Evolutionary Algorithm

Optimizing database queries is one of the hard research problems. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query increases, they require too much memory and processing to remain practical, so random and evolutionary methods have to be used instead. The use of evolutionary methods, beca...

Predator-Miner: Ad hoc Mining of Associations Rules within a Database Management System

In this demonstration, we present a prototype system, Predator-Miner, which extends Predator with a relational-like association rule mining operator to support data mining operations. Predator-Miner allows a user to combine association rule mining queries with SQL queries. This approach towards tight integration differs from existing techniques of using user-defined functions (UDFs), stored pro...

Set-Oriented Indexes for Data Mining Queries

One of the most popular data mining methods is frequent itemset and association rule discovery. Mined patterns are usually stored in a relational database for future use. Analyzing discovered patterns requires extensive subset search querying over a large number of database tuples. Indexes available in relational database systems are not well suited for this class of queries. In this paper we study...

Set-Oriented Data Mining in Relational Databases

Data mining is an important real-life application for businesses. It is critical to find efficient ways of mining large data sets. In order to benefit from the experience with relational databases, a set-oriented approach to mining data is needed. In such an approach, the data mining operations are expressed in terms of relational or set-oriented operations. Query optimization technology can th...

Discovering and Exploiting Statistical Properties for Query Optimization in Relational Databases: A Survey

Discovering and exploiting statistical features in relational datasets is key to query optimization in a relational database management system (RDBMS), and is also needed for database design, cleaning, and integration. This paper surveys a variety of methods for automatically discovering important statistical features such as correlations, functional dependencies, keys, and algebraic constraint...

Publication date: 2003